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Abstract 

This paper describes an algorithm for selecting pa- 
rameter values (e.g. temperature values) at which to 
measure equilibrium properties with Parallel Tem- 
pering Monte Carlo simulation. Simple approaches 
to choosing parameter values can lead to poor equi- 
libration of the simulation, especially for Ising spin 
systems that undergo l s t-order phase transitions. 
However, starting from an initial set of parameter 
values, the careful, iterative respacing of these val- 
ues based on results with the previous set of values 
greatly improves equilibration. Example spin sys- 
tems presented here appear in the context of Quan- 
tum Monte Carlo. 



1 Introduction 

This paper describes our experience with and mod- 
ifications to an algorithm to enhance Parallel Tem- 
pering (PT) Monte Carlo |T] known as Feedback Op- 
timized Parallel Tempering (FOPT) [3 . Though 
our experimental analysis will focus on the Quan- 
tum Ising Spin Glass in a transverse field, we shall 
try to keep the concepts as general as possible as we 
believe that workers in as diverse fields as Bayesian 
Statistics, Computer Science, and Molecular Simu- 
lation may be interested in optimizing PT on dif- 
ficult problems. While PT has proved itself to be 
an indispensable tool in computer simulation, cer- 
tain aspects of some problems can present serious 
obstacles to its use. The idea of FOPT is to try to 
eliminate the barriers in replica-diffusion space that 
can hinder the decorrclative effect that PT is sup- 



posed to overcome. Our initial experiments with 
PT on the Quantum Ising Glass encountered pre- 
cisely such difficulties, but we found it necessary to 
make some adaptations to FOPT for it to function 
robustly. 

To ease the presentation to those unfamiliar with 
Quantum Monte Carlo (QMC) or spin systems but 
knowledgeable with Markov Chain Monte Carlo 
(MCMC, also called dynamical Monte Carlo) sim- 
ulation, the specialized, physics-oriented concepts 
are confined to Section |4l which briefly reviews the 
Quantum Ising Glass and how it is simulated. Un- 
interested readers can skip that section and simply 
imagine that within is derived a set of intractable 
high-dimensional distributions (which in this case 
are defined over a binary- valued state space) that 
we wish to sample from. 

Section [2] reviews the ideas of PT and FOPT; 
our contributions to improving the method are dis- 
cussed in Section [3] Section [4] as mentioned above, 
discusses quantum spin simulation, and Section [5] 
presents some numerical data. 

2 Parallel Tempering 
2.1 Basic PT 

An excellent general survey of PT can be found in 
[2]; for completeness some basic concepts are pre- 
sented. Suppose we have a family {/^(x)} of M 
distributions defined on a state space X each mem- 
ber of which is dependent on some parameter A/c, 
i.e. /fc(x) = /(x|Afc). We assume that the terminal 
parameters Ai and Am are fixed and that imple- 
menting naive MCMC at those values, for example 
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local-variable Metropolis or Gibbs (heat bath) sam- 
pling, will result in slowly and quickly equilibrating 
( "mixing" ) Markov Chains respectively. Ai and Am 
could, for instance, be low and high temperatures; in 
the systems we simulate, A is a parameter that plays 
roughly the same role as temperature. The values 
of the remaining {Afc} are assumed to be flexible in 
this work provided that equilibration gets easier for 
increasing k. We refer to a local MCMC simulation 
at a given parameter value as a chain. 

The method of PT supplements the usual within- 
system Monte Carlo updates with phases during 
which swaps of entire states between chains at dif- 
ferent parameter values are attempted. A particu- 
lar state is sometimes called a replica. In practice, 
to have a non-negligible swapping probability, ex- 
changes are only attempted between neighboring pa- 
rameters {Afc, Afc + i}. An exchange is accepted prob- 
abilistically according to the Metropolis ratio 



spacing for the feedback procedure and in Section 
3.21 to stabilize the algorithm. 



a = min 1, 



/fc(x fc+ l)/fc + l(xfc) 

Jfe(xfc)/fc+i(xfc+i) 



(1) 



It can be shown that the combination of the lo- 
cal MCMC moves and the swapping implements a 
Markov chain on the joint state space X M whose 
invariant distribution is A(x 1 )/ 2 (x 2 ) . . . /m( x m)- 

The advantage of PT is that in theory, it can en- 
able the system to escape from local minima it may 
encounter during the local updates at large A. It is 
desirable that a given replica should drift relatively 
easily between the terminal A as this would suggest 
that the state at Am at the end of a "round trip" in 
parameter space has decorrelated from its value at 
the start. It may appear at first sight that merely 
allowing a reasonable swapping probability would 
suffice. The latter can be achieved by spacing the 
Afc close enough together to ensure this. One can 
be more precise (see, e.g. [2]); it can be shown, by 
expanding log /(x|A) to second order about Afc, that 
the expected log acceptance rate between parame- 
ters k and k + 1 under the equilibrium distributions 
is 

Epog(a)] ~ -7(A fc )(Afc +1 - A fc ) 2 (2) 

where I(X k ) = - £ x f k (x) g!igg&Al | ^ , We will 
use this result in Section [3JJ to determine the initial 



2.2 Feedback-Optimized PT 

As pointed out by [3], making PT work properly 
can be more subtle than merely assuring sufficient 
overlap between chains. Even though the marginal 
swap rate between all neighboring parameters may 
seem reasonable (say 25%,) there can exist higher- 
order bottlenecks in parameter space that effectively 
choke the replica diffusion. A given replica can re- 
peatedly make its way to the bottleneck but only 
rarely pass through it. Choosing the {Afc} to remedy 
this issue is thus not a matter of merely achieving a 
given swapping rate between adjacent chains but of 
ensuring that a large number of replica round-trips 
takes place. 

We now briefly summarize the insights of [3] . Let 
n up {\k) and ndown(^k) denote, for a given run of 
PT, the total number of replica that have visited pa- 
rameter Afc that were drifting "upward" and "down- 
ward" respectively. An upward-drifting replica is 
one that, of the two terminal parameters, has vis- 
ited Ai most recently; a downward-drifting one has 
last visited Am ■ Replica that have not yet visited ei- 
ther endpoint do not contribute to the n up or Udown 
sums. To maximize the rate at which the replica 
diffuse from Ai to Am, the flow fraction 



/(Afc) = 



'up 



(A fc ) 



(Afc) + ndown(^k) 



(3) 



should decrease linearly from Ai to Am 0- In [3], 
an iterative algorithm is described, where, after a 
run of PT with a given set of parameters, new pa- 
rameters are generated from the old ones and the 
measured / so as to (hopefully) make / closer to 
optimal during the subsequent run of PT. In other 
words, the linearly-decreasing / is a fixed point of 
the procedure. The algorithm tends to move the 
parameter values away from where / has a small 
slope towards locations where it drops off sharply. 
For convenience, we restate in Algorithm [T] a ver- 
sion of it appearing in [4] that is somewhat more 
transparent than the original one defined in [3]. 
The procedure can be stopped before the passage 
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Algorithm 1: Feedback-Optimized PT 
Input : {Afe}: Initial parameter set 
M: Number of parameters 
Niter- Number of feedback iterations 
N sweep'- Number of sweeps of PT within an iteration 
Output: {Afe}: Optimized parameter set 
begin 

for i = 1 . . . Nuer do 

ParallelTempering( Afe ) for N swee p steps, calculate / 
Define g(f) such that: 

1. g(f(\ k )) = A fc 

2. Within (Afe, Afc+i), <?(/) is a linear interpolation between g(f(\k)) and g(f(Xk+i)) 

for k = 2 . . . M - 1 do 

L A fe <" 9{{k - 1)1 {M - 1)) 



of N^er iterations if some criterion is specified and 
met; it may also be desirable to increase N sweep from 
one iteration to the next [3J. 

Impressive results in eliminating a flow bottle- 
neck in the planar ferromagnetic Ising model are 
presented in [3J, which suggested that FOPT was a 
natural candidate to alleviate a similar issue we en- 
countered while simulating Quantum Ising glasses 
with PT. Unfortunately, in its initial form, the al- 
gorithm was unstable on these problems: one is- 
sue was that it was too zealous in its placement of 
the parameters around the measured drop-off in the 
fraction. For a given number of chains and simula- 
tion time, this would cause some parameter inter- 
vals to become too wide, resulting in chains where 
no upward or downward-moving replica visit and 
hence breaking the recursion. We suspect this to be 
due to the extreme sensitivity of the systems' behav- 
ior around the bottleneck, and the fact that owing 
to the large system sizes, it was infeasible to run 
PT for the times required to obtain enough statis- 
tics to accurately calculate /. Interestingly, this 
problem seemed to appear even when the number 
of chains was increased dramatically; its occurrence 
was merely delayed by a few iterations. 

Nonetheless, several factors drove us to make 



practical modifications to FOPT, not least of which 
was the sheer impracticality of manually choosing 
good parameter values for a sizable number of very 
large and difficult problems. In the next section we 
detail our implementation improvements. 

3 Robust Feedback-Optimized 
PT 

There were several components that we found to 
be necessary for feedback-optimization to properly 
function on our set of problems, but we emphasize 
that fundamentally, the goal is the same as that of 
the original FOPT, namely achieving a straight-line 
/■ 

3.1 Initial Parameter Spacing 

In some cases, good rules of thumb exist for choos- 
ing parameter spacings in PT; when dealing with 
temperature, for example, a geometric sequence has 
been suggested to be reasonable in certain circum- 
stances [5]. Generally, such guidelines are often 
unclear; indeed this is the point of the feedback- 
optimization idea. In our practical experience 
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though, the initial parameter spacings and espe- 
cially the number of chains can have a substantial 
influence on the efficacy of the feedback procedure 
for a given amount of computation time. In partic- 
ular, our experience has shown that starting with 
an excessive number of chains can cause poor per- 
formance. 

For our initialization strategy, we devised the sim- 
ple method AddChains(), described in Algorithm^ 
Starting from a sparse, linearly-spaced initial set of 
parameters, a short run of PT takes place within 
which the exchange rates are estimated. Parame- 
ters are then added in such a way as to achieve some 
minimum swap rate. The number of initial chains 
can be relatively small; for example for the size N 
spin systems we considered, we typically used \J~N / 4 
chains (determined heuristically) or less. When PT 
is run with such a low number of parameters, the 
swapping probability is bound to be very small for 
quite a few parameter intervals. In fact if we tried to 
estimate these swapping probabilities by histogram- 
ming the number of swaps occurring in a typical- 
length run, we would probably end up with zeros, 
especially if the measurement takes place only after 
a certain number of equilibration (sometimes called 
burn-in) steps. However it is possible, instead of 
averaging the number of swaps themselves, to aver- 
age the log Metropolis ratios between neighboring 
chains each time swaps are attempted. Specifically, 
if during the nth swap attempt, we compute the 
ratio: 



(™) • i i 

a k,k+i = mm 1 X ' 



/fc(Xfc + i)/fc + i(Xfc) 
/fe(xfc)/fc+l(x fe+ l) 



and if N swap is the number of swap attempts, we 
define 



L 



k,k+l 



N. 



swap 



N Bmar 

E 

71=1 



log I 



x k,k+l) 



(4) 



It is not necessary to carry out PT to full equi- 
librium to usefully estimate the log swap rates us- 
ing crude estimates calculated over a relatively 
short PT run can satisfactorily inform how many 
chains need to be added to intervals that are too 
wide, even when the estimated swap rates are very 



low (say 10 -30 or so.) Determination of the num- 
ber of needed chains needed in each interval is then 
done using the second-order approximation in ([2]). 

A run of the initialization routine would thus pro- 
ceed as follows: begin with a minimum swapping 
threshold a m i„ (e.g. 20%), and an initial number 
of chains Mq with Am = Am- After measuring the 
swap rates resulting from a uniform parameter grid, 
"grow" the number of chains (to a final value of 
Mi) attempting to meet the threshold; practically, 
the list of new parameters can be generated by pop- 
ulating it with the initial values, appending the new 
ones to the end, and sorting it when finished. 

Algorithm 2: AddChains 

Input : Mq: Initial number of parameters 
Ai,Am : Terminal parameters 
a m in'- Minimum target swap rate 

Output: M\: Number of final parameters 
{Afe}: Final set of parameters 

begin 

// Initial parameter list; linear 

spacing 
for A; = 2 ... M - 1 do 

L A fc = Ai + fc(A M - Ai)/(M - 1) 
Run PT with {A^} for N sweep moves, 
obtaining {L ktk+1 } 

II Append parameters if intervals 
too wide 

Mi <- M 

for k = 1 . . . M - 1 do 



log(a mi „) 

AA <- (Afc+i — Xk)/(R - 
for j — 1 . . . R do 

|_ Ami+j «- Afc + j A A 
Mi <- Mi + R 
{A fc } <- sort({A fe }) 



The procedure can be repeated with the new pa- 
rameters, though we have found this to be unneces- 
sary. Once the initial set has been determined, the 
feedback phase, discussed in the next section, can 
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begin. 

3.2 Feedback Procedure 

The first problem to tackle in the feedback process 
is that of instability due to unreliable / data or ex- 
treme problem sensitivity. As alluded to in Section 
12.21 if / possesses a severe drop-off, the original algo- 
rithm tends to over-concentrate the available chains 
around it, resulting in some intervals being too wide 
for any replica to pass through at all in the next it- 
eration. / thus becomes mathematically undefined 
due to n up = n c i OW n = 0, and the algorithm malfunc- 
tions. To redress this, an approach we implemented 
was to smooth / in such a way that the parameters 
do not move too quickly. We defined f S mooth(Xk) = 
(1 — w)f(Xk) + wL(Xk), where to is a weight be- 
tween and 1, and L(X k ) = 1 - (k - 1)/(M - 1), 
i.e. it decreases linearly from 1 to in M steps, or 
alternatively, it is the fixed point of the optimiza- 
tion algorithm. If w = 0, we recover the original 
FOPT; at the other extreme, if w — 1, the feedback 
does nothing to the parameters. At intermediate 
values of w, applying feedback to f smooth instead of 
/ has a damping effect on the parameters' motion. 
It may be advantageous to reduce w between subse- 
quent feedback iterations as the initial estimates of 
/ become better. 

In our experiments, the smoothing trick on its 
own did not always eliminate the problem of overly- 
wide intervals, though it certainly postponed this 
unfortunate behavior. Consequently, we developed 
a method that would override the feedback routine's 
output if it was likely that the resultant swap rates 
were so low that they would cause the issue dis- 
cussed above to occur. We called this the post- 
process step; it is summarized in Algorithm [3l Us- 
ing the approximate relation between the log 
accept rate and the interval width, this procedure 
uses the accept rates estimated during the iteration 
to predict what the accept rates would be due to 
the new parameters, i.e. those resulting from the 
feedback. If, for a given interval, the predicted rate 
is lower than a threshold a m i n , its width is thresh- 
olded. This method may result in the final interval 
being too wide, and since the endpoints are assumed 



to be static, more parameters are added there to 
compensate, again with the objective of attaining at 
least a m in in the resultant new intervals. Hence the 
total number of chains can continue to grow during 
the feedback procedure. Of course a m i n need not be 
the same as the one used in the AddChains phase; 
indeed we have found that it should be kept as low 
as possible (say around 5%) in order to minimize in- 
trusiveness on the parameter search. Its role should 
be viewed exclusively as one of "life support." 

A somewhat counter-intuitive fact we observed 
was that in the initial iterations at least, it can 
sometimes be favorable to use the normalized func- 
tion n 'down(^k) = 1 _ n down (X k )/n down (X M ) or its 
smoothed version as a surrogate for /. Due to the 
rapid equilibration at large X kl this quantity tends 
to stabilize into its final form more quickly than / 
does; using it instead of / makes the implicit as- 
sumption that the analogous function n' up (Xk) = 
n-up(Xk) / riup(Xi) will, when it properly converges, 
be symmetric to it, i.e. n' up w 1 — n' down . This as- 
sumption may end up being incorrect, but it seems 
in some cases to be more reliable than using faulty 
/ data in the beginning. As with the smoothing 
of /, as the iterations advance and the bottlenecks 
are more accurately located, one can start using the 
actual / in the feedback. 

A final implementation aid we noticed to help 
speed up the convergence of / considerably within 
each feedback iteration was initializing all the chains 
with a ground state (global minimum) if it is known. 
If the ground state is unique, as it was in the systems 
we were considering, this is the correct equilibrium 
behavior at the low A values. 

The next section discusses the model we per- 
formed our experiments on. It can be skipped by 
non-physicists. 

4 Quantum Ising Glass 

Equilibrium simulation of quantum spin systems can 
be done by application of the Suzuki- Trotter (ST) 
framework [6] ; the quantum system's partition func- 
tion is approximated by a partition function corre- 
sponding to that of a classical Ising model consisting 
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Algorithm 3: Post-Process Parameters 
Input : {Afc}: parameters at start of feedback 
iteration 

{X' k }: parameters after feedback 

{Lk.k+i}'- log swap rates measured 
using parameters {Afc} 

a m in- minimum target swap rate 

M: initial number of {Afc} 
Output: {A^}: parameters after post-process 

M: final number of parameters {A' fc '} 

begin 

// This stage handles {Afc} intervals 
that are too wide 

K <- Ai 

for k = 1 . . . M - 2 do 

I <- max{j|Aj < A' fe '} 

AA «- + l - Xl 



A Am 

AA' 



A\\/\og(a min )/Lij + i 
Afc 



if AA' > AX max then 



else 



X'l + AA, 



A', 



fc+i 



// This stage adds chains to the 
final interval if needed 

I <- max{j|Aj < X'm_ 1 } 
AA <- A;+i - A; 



M i d <- 



A\\/log(a m i n )/Lu + i 



M 



A <- X'l I _ 1 
while A < A', 



AA.„ 



x" 



M . 



do 



A 



I <— max^'lAj < A} 
AA <- Xi+i - Xi 



\og(a min )/Li M 



X^ 
M 



A 



A + AX max 
-M + l 

M 



of multiple ferromagnetically-coupled copies ( "Trot- 
ter slices") of the original system. If one can draw 
equilibrium samples from this effective classical sys- 
tem, expectations with respect to the quantum sys- 
tem can be estimated. Representations using more 
slices give a better approximation but are more com- 
putationally demanding to simulate. 

Our aim in these simulations was estimation of 
the minimum energy gap as a function of the quan- 
tum adiabatic parameter X G [0,1]; the quantum 
Hamiltonian at a given A is given by H(X) — 
A{X)Hb + B(X)Hp, where Hb and Hp are purely 
quantum and classical Hamiltonian operators re- 
spectively, and the functions A(X),B(X) are such 
that the system is classical and quantum for A = 
and A = 1 respectively. 

Omitting unnecessary details, the Hamiltonians 
Hp came from a specific class of non-planar Ising 
Glasses. Our gap calculation methodology was the 
same one appearing in [5], which crucially depends 
on obtaining accurate estimates of the spin correla- 
tion function. For large system sizes, naive Monte 
Carlo algorithms suffer from equilibration issues at 
small values of A, hindering the estimation. It 
seemed natural that PT would assist in this prob- 
lem; the set of target distributions fk simply cor- 
responded to the ST representations of the quan- 
tum systems at the {Afc}. It turned out that for 
many such problems, PT can have serious problems 
of the sort discussed in this paper; their presence 
corresponded to the existence of first-order quantum 
phase transitions. Tuning the parameters by hand 
is very difficult for such problems; small changes to 
the parameters had dramatic effects on /. A PT 
optimization method was thus essential for simula- 
tions we were interested in. As an added benefit, 
the method we implemented tends to concentrate 
the parameters precisely around the point we were 
most interested in, namely where the spectral gap 



is minimum. 



5 Experiments 

The PT optimization algorithm discussed in this pa- 
per was used to determine parameter spacings for 
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hundreds of quantum spin systems, which, in their 
approximate representation, correspond to complex, 
binary-valued systems of up to 32768 variables 
(these resulted from representing a 128 quantum 
spin system with 256 Trotter slices.) The param- 
eters were determined individually for each instance 
as it was observed that a set optimized for one in- 
stance was likely to perform poorly on other in- 
stances in the same class of problems. 

For stepwise illustration, let us consider a partic- 
ular 16-spin Quantum Ising Glass represented with 
128 Trotter slices to yield a system size of N — 2048. 
This specific instance was extremely difficult to sim- 
ulate with PT owing to a sharp first-order quantum 
phase transition. If the parameters were spaced so 
as to achieve a uniform swap rate, a dramatic flow 
bottleneck manifested itself. It was also impervious 
to the initial FOPT algorithm, as well as numerous 
other approaches which seemed promising on other 
problems. 

In order to allow a fair comparison, in both the 
original FOPT and in the version with our improve- 
ments, we applied the AddChains initialization rou- 
tine to determine the starting parameters, and the 
states were all seeded with the ground state. 

First we demonstrate how Algorithm[2]performed. 
AddChains was called on this system using a target 
swap threshold of 18%, 70000 sweeps of PT, and 20 
chains to start. Figure [T] shows both how the accep- 
tance rates responded to the interspersion and the 
particular locations in A space that parameters were 
added. The estimator of the swap rates at the ini- 
tial parameter spacing produced L as low as —130; 
thus in the 70000 sweeps we performed, actually ob- 
serving any swaps in such an interval is virtually 
impossible. The lower plot in Figure Q] shows where 
parameters were added to achieve the target; note 
the concentration around A ~ 0.6. Our rough algo- 
rithm succeeded in achieving the target rate in all 
intervals. 

In Figure [2] a run of the initial FOPT algorithm 
and our stabilized version are shown. For our algo- 
rithm, we used the fairly strong smoothing value of 
w = 0.75; the number of PT sweeps was constant 
at 200000 in each iteration. For PostProcess, the 
minimum interval swap rate was set to 3%. The 
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Figure 3: (Color online) Progress of the upward- 
moving fraction / when our algorithm is applied 
to the N = 2048 Ising spin system discussed in 
the text. Note the abrupt decline when PT is run 
with the results of AddChains. Subsequent iter- 
ations progressively make / closer to the desired 
form, namely a straight line. 200000 PT sweeps 
were used to gather the / statistics at each itera- 
tion. 



details can be read from the caption in the Figure, 
but the essential point is that the original algorithm 
aggressively spaces the parameters in such a way 
that the recursion cannot proceed past a single it- 
eration. For all but the endpoint parameters, a PT 
run of realistic length using such {A} would give 
n up = Tidown = 0. For illustration, in Figure [3] we 
show the progression of / (not f smooth ) at differ- 
ent iterations of our method. Even on this highly 
troublesome instance, the parameters were finally 
spaced in such a way that enough round-trips took 
place to allow accurate estimation of the minimum 
excitation gap. 

To demonstrate efficacy on the typical system 
sizes we were ultimately interested in simulating, in 
Figure [4] we present results of the evolution of / 
on an N = 24576 spin model resulting from repre- 
senting a 96 spin quantum system with 256 Trotter 
slices. The results were obtained using the same 
simulation parameter settings as those used for the 
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Figure 1: (Color online) Illustration of the AddChains algorithm on an N = 2048 Ising spin system. In the 
bottom graph, we show the grid of initial linearly-spaced parameters and the resultant A after AddChains. 
The top graph plots the initial and final estimated swap rates after 70000 sweeps of PT (in the left side of 
the plot, the blue and red lines are superimposed.) Around the region with A G [0.5, 0.8] the initial estimated 
swap rate plunged as low as w e -79 ; after the parameter addition, all intervals had swaps taking place over 
the criterion rate, shown by the black dotted line. The initial grid consisted of 20 parameters, to which 46 
were added by the algorithm. 
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Figure 4: (Color online) Progress of the upward- 
moving fraction / when the algorithm is applied to 
the N = 24576 Ising spin system discussed in the 
text. 



previously-discussed 16-spin system. The resultant 
parameter spacing again allowed sufficiently good 
PT performance for us to obtain high-quality Monte 
Carlo data. 

6 Conclusion 

We have presented improvements to the problem of 
iterative parameter selection for PT. Our contribu- 
tion has consisted of three parts: first, an initial- 
ization strategy to place parameters in a reason- 
able manner when no a priori information is known 
about the problem domain; second, the introduction 
of damping mechanisms to the FOPT algorithm of 
[3] that assist in tackling the problem of instability; 
third a post-processing procedure that prevents the 
algorithm from malfunctioning. We demonstrated 
the effectiveness of the method by showing experi- 
ments on a particularly difficult quantum spin sys- 
tem represented as a classical Ising model with 2048 
spins. We have tested our algorithm on much bigger 
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Figure 2: (Color online) Illustration of the progression of the parameters using the original FOPT algorithm 
and the one with our modifications on the N = 2048 Ising spin system discussed in the text. Both algorithms 
began with the A spacings generated by AddChains; all replica were initialized with the ground state. At 
left, we see that in a single iteration, FOPT spaces all the parameters around A 0.6, resulting in very 
wide intervals between the endpoints. The algorithm could thus not proceed past this point because the 
replica in the middle could no longer visit either terminus; consequently / was mathematically undefined. 
Our algorithm's output appears at right. At each iteration step the resultant A using a smoothed version of 
/ in FOPT is shown in blue; the A returned when those parameters are passed to the PostProcess algorithm 
appear in red. Note that to keep the swap rate over the threshold of 3%, PostProcess often "pulled" 
parameters back if it predicted that the interval was too wide. 
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systems with satisfactory results. 
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